AITopics | fragmentation-coagulation process

Scalable imputation of genetic data with a discrete fragmentation-coagulation process

Neural Information Processing Systems, Mar-14-2024, 11:03:49 GMT

We present a Bayesian nonparametric model for genetic sequence data in which a set of genetic sequences is modelled using a Markov model of partitions. The partitions at consecutive locations in the genome are related by the splitting and merging of their clusters. Our model can be thought of as a discrete analogue of the continuous fragmentation-coagulation process [Teh et al 2011], preserving the important properties of projectivity, exchangeability and reversibility, while being more scalable. We apply this model to the problem of genotype imputation, showing improved computational efficiency while maintaining accuracies comparable to other state-of-the-art genotype imputation methods.

dfcp, partition, sequence


Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)


Modelling Genetic Variations using Fragmentation-Coagulation Processes

Neural Information Processing Systems, Apr-6-2023, 12:56:27 GMT

We propose a novel class of Bayesian nonparametric models for sequential data called fragmentation-coagulation processes (FCPs). An FCP is exchangeable, projective, stationary and reversible, and its equilibrium distributions are given by the Chinese restaurant process. As opposed to hidden Markov models, FCPs allow for flexible modelling of the number of clusters, and they avoid label switching non-identifiability problems. We develop an efficient Gibbs sampler for FCPs which uses uniformization and the forward-backward algorithm. Our development of FCPs is motivated by applications in population genetics, and we demonstrate the utility of FCPs on problems of genotype imputation with phased and unphased SNP data.
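The abstract states that the equilibrium distribution of an FCP is the Chinese restaurant process (CRP). As a rough illustration of what a single draw from that distribution looks like, the sketch below samples a CRP partition of `n` sequences by the standard sequential seating scheme. The function name `sample_crp_partition` and the concentration parameter `alpha` are illustrative choices, not taken from the paper, and this is not the paper's Gibbs sampler.

```python
import random


def sample_crp_partition(n, alpha, rng):
    """Draw a partition of range(n) from a CRP with concentration alpha.

    Illustrative helper (hypothetical name): item i joins an existing
    cluster c with probability len(c) / (i + alpha) and starts a new
    cluster with probability alpha / (i + alpha).
    """
    clusters = []
    for i in range(n):
        # One weight per existing cluster, plus one for opening a new cluster.
        weights = [len(c) for c in clusters] + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(clusters):
            clusters.append([i])
        else:
            clusters[k].append(i)
    return clusters


# Example: partition 10 sequences with alpha = 1.0 (seeded for repeatability).
partition = sample_crp_partition(10, 1.0, random.Random(0))
print(partition)
```

The number of clusters is not fixed in advance, which is the sense in which the abstract contrasts FCPs with hidden Markov models that use a fixed state count.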

fcp, fragmentation-coagulation process, modelling genetic variation


Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)


Modelling Genetic Variations using Fragmentation-Coagulation Processes

Teh, Yee W., Blundell, Charles, Elliott, Lloyd

Neural Information Processing Systems, Feb-14-2020, 22:13:23 GMT


Scalable imputation of genetic data with a discrete fragmentation-coagulation process

Elliott, Lloyd, Teh, Yee W.

Neural Information Processing Systems, Dec-31-2012

We present a Bayesian nonparametric model for genetic sequence data in which a set of genetic sequences is modelled using a Markov model of partitions. The partitions at consecutive locations in the genome are related by their clusters first splitting and then merging. Our model can be thought of as a discrete-time analogue of continuous-time fragmentation-coagulation processes [Teh et al., 2011], preserving the important properties of projectivity, exchangeability and reversibility, while being more scalable. We apply this model to the problem of genotype imputation, showing improved computational efficiency while maintaining the same accuracies as in [Teh et al., 2011].
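To make the "first splitting and then merging" structure concrete, here is a toy Markov chain over partitions in which each step applies a fragmentation followed by a coagulation. Everything in it is an assumption made for illustration: the names `fragment`, `coagulate` and `step`, and the fixed probabilities `split_prob` and `merge_prob`, do not come from the paper. The actual model uses specific transition probabilities chosen so that the chain is exchangeable, projective and reversible; this sketch makes no attempt to reproduce those properties.

```python
import random


def fragment(partition, split_prob, rng):
    """Toy fragmentation: split each cluster in two with probability split_prob."""
    out = []
    for cluster in partition:
        if len(cluster) > 1 and rng.random() < split_prob:
            members = list(cluster)
            rng.shuffle(members)
            cut = rng.randint(1, len(members) - 1)  # both halves non-empty
            out.append(sorted(members[:cut]))
            out.append(sorted(members[cut:]))
        else:
            out.append(sorted(cluster))
    return out


def coagulate(partition, merge_prob, rng):
    """Toy coagulation: merge each cluster into an earlier one with probability merge_prob."""
    out = []
    for cluster in partition:
        if out and rng.random() < merge_prob:
            target = rng.randrange(len(out))
            out[target] = sorted(out[target] + cluster)
        else:
            out.append(list(cluster))
    return out


def step(partition, split_prob, merge_prob, rng):
    """One transition between consecutive locations: split first, then merge."""
    return coagulate(fragment(partition, split_prob, rng), merge_prob, rng)


# Six sequences, starting from two clusters, followed along five locations.
rng = random.Random(1)
chain = [[[0, 1, 2], [3, 4, 5]]]
for _ in range(5):
    chain.append(step(chain[-1], 0.3, 0.3, rng))
for location, partition in enumerate(chain):
    print(location, partition)
```

Each element of `chain` is the partition of the same six sequences at one genome location, so sequences that share a cluster at a location can be read as sharing a local haplotype there.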
